NSF PAR Search | NSF Public Access Repository

Digital medicine and the curse of dimensionality

https://doi.org/10.1038/s41746-021-00521-5

Berisha, Visar; Krantsevich, Chelsea; Hahn, P. Richard; Hahn, Shira; Dasarathy, Gautam; Turaga, Pavan; Liss, Julie (December 2021, npj Digital Medicine)

Abstract Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
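One common way to make the curse of dimensionality concrete is distance concentration: as the number of features grows, the nearest and farthest neighbours of a point become almost equally far away, so a fixed-size sample covers the feature space ever more sparsely. The sketch below is not taken from the paper. It uses uniformly random synthetic points, and the function name `relative_contrast`, the sample size, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Points are drawn uniformly from the unit hypercube (an assumption made
    for illustration; real health data are not uniform). A small value means
    that all points look roughly equidistant from the query.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


if __name__ == "__main__":
    # Same sample size, increasing number of features.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

With the sample size held fixed, the printed contrast should shrink as `dim` grows, which is one reason a model fitted on a modest number of patients and many features may behave unpredictably on new data.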
Full Text Available
Automated semantic relevance as an indicator of cognitive decline: Out‐of‐sample validation on a large‐scale longitudinal dataset

https://doi.org/10.1002/dad2.12294

Stegmann, Gabriela; Hahn, Shira; Bhandari, Samarth; Kawabata, Kan; Shefner, Jeremy; Duncan, Cayla Jessica; Liss, Julie; Berisha, Visar; Mueller, Kimberly (January 2022, Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring)

Full Text Available
Estimation of forced vital capacity using speech acoustics in patients with ALS

https://doi.org/10.1080/21678421.2020.1866013

Stegmann, Gabriela M.; Hahn, Shira; Duncan, Cayla J.; Rutkove, Seward B.; Liss, Julie; Shefner, Jeremy M.; Berisha, Visar (July 2021, Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration)

Full Text Available
Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis

https://doi.org/10.1038/s41746-020-00335-x

Stegmann, Gabriela M.; Hahn, Shira; Liss, Julie; Shefner, Jeremy; Rutkove, Seward; Shelton, Kerisa; Duncan, Cayla Jessica; Berisha, Visar (October 2020, npj Digital Medicine)

Abstract Bulbar deterioration in amyotrophic lateral sclerosis (ALS) is a devastating characteristic that impairs patients’ ability to communicate, and is linked to shorter survival. The existing clinical instruments for assessing bulbar function lack sensitivity to early changes. In this paper, using a cohort of N = 65 ALS patients who provided regular speech samples for 3–9 months, we demonstrated that it is possible to remotely detect early speech changes and track speech progression in ALS via automated algorithmic assessment of speech collected digitally.
Repeatability of Commonly Used Speech and Language Features for Clinical Applications

https://doi.org/10.1159/000511671

Stegmann, Gabriela M.; Hahn, Shira; Liss, Julie; Shefner, Jeremy; Rutkove, Seward B.; Kawabata, Kan; Bhandari, Samarth; Shelton, Kerisa; Duncan, Cayla Jessica; Berisha, Visar (December 2020, Digital Biomarkers)
Introduction: Changes in speech have the potential to provide important information on the diagnosis and progression of various neurological diseases. Many researchers have relied on open-source speech features to develop algorithms for measuring speech changes in clinical populations because these features are convenient and easy to use. However, the repeatability of open-source features in the context of neurological diseases has not been studied. Methods: We used a longitudinal sample of healthy controls, individuals with amyotrophic lateral sclerosis, and individuals with suspected frontotemporal dementia, and we evaluated the repeatability of acoustic and language features separately on these 3 data sets. Results: Repeatability was evaluated using intraclass correlation (ICC) and the within-subjects coefficient of variation (WSCV). In 3 sets of tasks, the median ICCs were between 0.02 and 0.55, and the median WSCVs were between 29% and 79%. Conclusion: Our results demonstrate that the repeatability of speech features extracted using open-source toolkits is low. Researchers should exercise caution when developing digital health models with open-source speech features. We provide a detailed summary of feature-by-feature repeatability results (ICC, WSCV, SE of measurement, limits of agreement for WSCV, and minimal detectable change) in the online supplementary material so that researchers may incorporate repeatability information into the models they develop.
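As a rough illustration of the two repeatability metrics named in the Results, the sketch below computes a one-way random-effects ICC, often written ICC(1,1), and a within-subject coefficient of variation from repeated measurements of one feature. The abstract does not say which ICC form or which WSCV definition the authors used, so both formulas, the function names `icc_oneway` and `wscv`, and the toy data are assumptions for illustration only.

```python
from statistics import mean


def _anova_terms(data):
    """Return (MSB, MSW, grand mean, k) for a balanced one-way layout.

    data: list of subjects, each a list of k repeated measurements of the
    same feature. Every subject is assumed to have the same k >= 2.
    """
    n = len(data)
    k = len(data[0])
    grand = mean(x for row in data for x in row)
    row_means = [mean(row) for row in data]
    # Between-subject mean square.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square (pooled over subjects).
    msw = sum(
        (x - m) ** 2 for row, m in zip(data, row_means) for x in row
    ) / (n * (k - 1))
    return msb, msw, grand, k


def icc_oneway(data):
    """ICC(1,1): share of total variance attributable to subjects."""
    msb, msw, _, k = _anova_terms(data)
    return (msb - msw) / (msb + (k - 1) * msw)


def wscv(data):
    """Within-subject SD divided by the grand mean (one common definition)."""
    _, msw, grand, _ = _anova_terms(data)
    return (msw ** 0.5) / grand


if __name__ == "__main__":
    # Hypothetical feature values: 3 subjects, 2 sessions each.
    toy = [[10.0, 11.0], [20.0, 19.0], [30.0, 31.0]]
    print("ICC(1,1):", icc_oneway(toy))
    print("WSCV:", wscv(toy))
```

In this toy data the sessions agree closely within each subject, so the ICC is near 1 and the WSCV is small. The medians reported above (ICC between 0.02 and 0.55) correspond to features whose session-to-session noise is comparable to or larger than the differences between people.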
Full Text Available
